Tournament selection in zeroth-level classifier systems based on average reward reinforcement learning

Abstract

As a genetics-based machine learning technique, the zeroth-level classifier system (ZCS) is based on a discounted reward reinforcement learning algorithm, the bucket-brigade algorithm, which optimizes the discounted total reward received by an agent but is not suitable for all multi-step problems, especially large ones. Undiscounted reinforcement learning methods such as R-learning are also available; these optimize the average reward per time step. In this paper, R-learning replaces the discounted reward reinforcement learning approach in ZCS, and tournament selection replaces roulette wheel selection. The modification results in classifier systems that can support long action chains and are thus able to solve large multi-step problems.
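The two techniques named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not say how either technique is wired into ZCS, so the code below shows only generic tournament selection and a generic tabular R-learning update in one common textbook form. All names (`tournament_select`, `r_learning_update`, `beta`, `alpha`, `rho`) and the default parameter values are assumptions made for illustration.

```python
import random
from collections import defaultdict


def tournament_select(population, fitness, k=2, rng=random):
    """Generic tournament selection (illustrative, not ZCS-specific).

    Draw k contestants uniformly at random, with replacement, and return
    the one with the highest fitness. Unlike roulette wheel selection,
    the outcome depends only on the ranking of fitness values, not on
    their magnitudes.
    """
    contestants = [rng.choice(population) for _ in range(k)]
    return max(contestants, key=fitness)


def r_learning_update(R, rho, s, a, r, s_next, actions, beta=0.1, alpha=0.01):
    """One tabular R-learning step (assumed textbook form).

    R     : dict mapping (state, action) to a relative value
    rho   : current estimate of the average reward per time step
    beta  : step size for R (assumed value)
    alpha : step size for rho (assumed value)
    Updates R in place and returns the new rho.
    """
    best_next = max(R[(s_next, b)] for b in actions)
    best_here = max(R[(s, b)] for b in actions)
    # Assumption: rho is adjusted only when the action taken was greedy,
    # judged against the values held before this update.
    was_greedy = R[(s, a)] == best_here

    # Undiscounted update: the reward is measured relative to rho
    # instead of discounting future values.
    R[(s, a)] += beta * (r - rho + best_next - R[(s, a)])
    if was_greedy:
        rho += alpha * (r - rho + best_next - best_here)
    return rho


if __name__ == "__main__":
    R = defaultdict(float)
    rho = r_learning_update(R, 0.0, s=0, a="left", r=1.0, s_next=1,
                            actions=["left", "right"])
    print(R[(0, "left")], rho)

    strengths = {"c1": 5.0, "c2": 20.0, "c3": 12.0}  # hypothetical classifier strengths
    print(tournament_select(list(strengths), strengths.get, k=2))
```

Because the update subtracts the running average reward `rho` rather than discounting, values do not decay with distance from the reward, which fits the abstract's point about supporting long action chains.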